kreuzberg / liter-llm
High-performance LLM client for PHP. Unified interface for streaming, tool calling, and provider routing across OpenAI, Anthropic, and 142+ providers. Powered by Rust core.
Package info
github.com/kreuzberg-dev/liter-llm
Language: Rust
Type: php-ext
Extension name: ext-liter_llm_php
pkg:composer/kreuzberg/liter-llm
Requires
- php: ^8.2
Requires (Dev)
- friendsofphp/php-cs-fixer: ^3.94
- phpstan/phpstan: ^2.1
- phpunit/phpunit: ^11.0
Replaces
- ext-liter_llm_php: *
This package is auto-updated.
Last update: 2026-03-29 19:01:18 UTC
README
A lighter, faster, safer universal LLM API client -- one Rust core, 11 native language bindings, 142 providers.
Why liter-llm?
A universal LLM API client, compiled from the ground up in Rust. No interpreter, no transitive dependency tree, no supply chain surface area. One binary, 11 native language bindings, 142 providers.
- Compiled Rust core. No `pip install` supply chain. No `.pth` auto-execution hooks. No runtime dependency tree to compromise. The kind of supply chain attack that hit litellm in 2026 is structurally impossible here.
- Secrets stay secret. API keys are wrapped in `secrecy::SecretString` -- zeroed on drop, redacted in logs, never serialized.
- Polyglot from day one. Python, TypeScript, Go, Java, Ruby, PHP, C#, Elixir, WebAssembly, C/FFI -- all thin wrappers around the same Rust core. No reimplementation drift.
- Observability built in. Production-grade OpenTelemetry with GenAI semantic conventions -- not an afterthought callback system.
- Composable middleware. Rate limiting, caching, cost tracking, health checks, and fallback as Tower layers you stack like building blocks.
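In the Rust core these layers are Tower middleware; the stacking idea itself is language-agnostic and can be sketched as plain function composition. The sketch below uses Python purely for illustration -- the handler shape and layer names are hypothetical, not the library's API:

```python
from typing import Callable

Handler = Callable[[dict], dict]

def with_cost_tracking(inner: Handler) -> Handler:
    """Layer: annotate each response with a running total cost."""
    total = {"usd": 0.0}
    def handler(request: dict) -> dict:
        response = inner(request)
        total["usd"] += response.get("cost", 0.0)
        response["total_cost"] = total["usd"]
        return response
    return handler

def with_cache(inner: Handler) -> Handler:
    """Layer: naive in-memory cache keyed on the prompt."""
    cache: dict[str, dict] = {}
    def handler(request: dict) -> dict:
        key = request["prompt"]
        if key not in cache:
            cache[key] = inner(request)
        return cache[key]
    return handler

def base(request: dict) -> dict:
    """Stand-in for the actual provider call."""
    return {"text": request["prompt"].upper(), "cost": 0.01}

# Stack layers like building blocks: cache outermost, cost tracking inside.
stack = with_cache(with_cost_tracking(base))
first = stack({"prompt": "hello"})   # computed, cost recorded
second = stack({"prompt": "hello"})  # served from cache, no extra cost
```

The order matters, just as with Tower: putting the cache outside the cost tracker means cache hits never reach the billing layer.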
We give credit to litellm for proving the category -- our provider registry was bootstrapped from theirs. See ATTRIBUTIONS.md.
Feature Comparison
An honest look at where things stand. We're newer and leaner -- litellm has breadth we haven't matched yet, and we have depth they can't easily retrofit.
| | liter-llm | litellm |
|---|---|---|
| Language | Rust (compiled, memory-safe) | Python |
| Bindings | 11 native (Rust, Python, TS, Go, Java, Ruby, PHP, C#, Elixir, WASM, C) | Python (+ OpenAI-compatible proxy) |
| Providers | 142 (compiled at build time) | 100+ (runtime resolution) |
| Streaming | SSE + AWS EventStream binary protocol | SSE + AWS EventStream |
| Observability | Built-in OpenTelemetry (GenAI semconv) | 40+ callback integrations |
| API key safety | secrecy::SecretString (zeroed, redacted) | Plain strings |
| Middleware | Composable Tower stack | Built-in callback system |
| Proxy / Gateway | Yes (22 OpenAI-compatible endpoints, 35MB Docker) | Yes |
| Guardrails | -- | 10+ integrations, 4 execution modes (advanced: enterprise) |
| Semantic caching | -- | Redis + Qdrant backends |
| Virtual key mgmt | Yes (per-key model restrictions, RPM/TPM, budgets) | Yes (key rotation: enterprise) |
| Management API | Config-driven (REST admin API planned) | Multi-tenant (teams, budgets, keys; tiers + reporting: enterprise) |
| Fine-tuning API | -- | Enterprise only |
| Load balancer | Fallback + round-robin via Tower router | Full router with strategies |
| Cost tracking | Embedded pricing + OTEL spans | Per-key/team/model budgets |
| Rate limiting | Per-model RPM/TPM (Tower layer) | Per-key/user/team/model |
| Caching | In-memory LRU + 40+ backends via OpenDAL (S3, Redis, GCS, DynamoDB, disk, ...) | 7 backends (Redis, S3, GCS, disk, Qdrant) |
| Tool calling | Parallel tools, structured output, JSON schema | Full support |
| Embeddings | Yes | Yes |
| Batch API | Yes | Yes |
| Audio / Speech | Yes | Yes |
| Lifecycle hooks | onRequest/onResponse/onError per-client | Callback integrations |
| Budget enforcement | Per-model + global limits, hard/soft modes | Per-key/team budgets |
| Health checks | Automatic provider probes + cooldown | -- |
| Custom providers | Runtime API + TOML config file | Config + code-based |
| Config files | TOML with auto-discovery (liter-llm.toml) | YAML proxy config |
| Search / OCR | 12 search + 4 OCR providers | Yes |
| Image generation | Yes | Yes |
Key Features
- 142 providers -- OpenAI, Anthropic, Google, AWS Bedrock, Groq, Mistral, Together AI, Fireworks, Perplexity, DeepSeek, Cohere, and 130+ more
- 11 native bindings -- Rust, Python, TypeScript/Node.js, Go, Java, Ruby, PHP, C#, Elixir, WebAssembly, C/FFI
- First-class streaming -- SSE and AWS EventStream binary protocol with zero-copy buffers
- TOML configuration -- `liter-llm.toml` with auto-discovery, custom providers, cache backends, middleware config
- OpenTelemetry -- GenAI semantic conventions, cost-tracking spans, HTTP-level tracing
- Tower middleware -- Rate limiting, caching (40+ OpenDAL backends), cost tracking, budget enforcement, health checks, cooldowns, hooks, fallback -- all composable
- Search & OCR -- Web search across 12 providers, document OCR across 4 providers
- Tool calling -- Parallel tools, structured outputs, JSON schema validation
- Embeddings -- Dimension selection, base64 format, multi-provider support
- Per-request routing -- Automatic provider detection from model name prefix, custom provider registration at runtime
- Schema-driven -- Provider registry and API types compiled from JSON schemas, no runtime lookups
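Per-request routing above keys off the `provider/` prefix in the model name (e.g. `openai/gpt-4o`). The splitting rule can be sketched as follows -- the helper name and the fallback to a default provider are assumptions for illustration, not the library's actual API:

```python
def split_model(model: str, default_provider: str = "openai") -> tuple[str, str]:
    """Split a 'provider/model' string into (provider, model).

    A model without a prefix falls back to a default provider; that
    fallback choice is an assumption made for this sketch.
    """
    provider, sep, name = model.partition("/")
    if not sep:  # no '/' present: treat the whole string as the model name
        return default_provider, model
    return provider, name

print(split_model("openai/gpt-4o"))
print(split_model("anthropic/claude-sonnet-4-20250514"))
```

Switching providers is then just a string change in the caller, which is why the Quick Start example below needs no other code edits.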
Proxy Server & Docker
Drop-in replacement for litellm's proxy -- 22 OpenAI-compatible endpoints in a 35MB Docker image:
```bash
# Start the proxy
docker run -p 4000:4000 -e LITER_LLM_MASTER_KEY=sk-your-key ghcr.io/kreuzberg-dev/liter-llm

# Use it like OpenAI
curl http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hello"}]}'
```
Or with a TOML config file:
```toml
# liter-llm-proxy.toml
[general]
master_key = "${LITER_LLM_MASTER_KEY}"

[[models]]
name = "gpt-4o"
provider_model = "openai/gpt-4o"
api_key = "${OPENAI_API_KEY}"

[[models]]
name = "claude-sonnet"
provider_model = "anthropic/claude-sonnet-4-20250514"
api_key = "${ANTHROPIC_API_KEY}"

[[keys]]
key = "sk-team-a"
models = ["gpt-4o"]
rpm = 100
```
CLI:
```bash
liter-llm api --config liter-llm-proxy.toml   # Start proxy server
liter-llm mcp --transport stdio               # Start MCP tool server
```
Features: Model routing, virtual API keys, per-key rate limiting (RPM/TPM), cost tracking, budget enforcement, response caching, SSE streaming, OpenAPI 3.1 spec at /openapi.json, MCP server with 22 tools, graceful shutdown.
Architecture
```
liter-llm/
├── crates/
│   ├── liter-llm/        # Rust core library
│   ├── liter-llm-py/     # Python (PyO3) core
│   ├── liter-llm-node/   # Node.js (NAPI-RS) core
│   ├── liter-llm-ffi/    # C-compatible FFI layer
│   ├── liter-llm-php/    # PHP (ext-php-rs) core
│   └── liter-llm-wasm/   # WebAssembly (wasm-bindgen) core
├── packages/
│   ├── python/           # Python package
│   ├── typescript/       # TypeScript/Node.js package
│   ├── go/               # Go (cgo) module
│   ├── java/             # Java (Panama FFI) package
│   ├── ruby/             # Ruby (Magnus) gem
│   ├── elixir/           # Elixir (Rustler NIF) package
│   ├── csharp/           # .NET (P/Invoke) package
│   └── php/              # PHP (Composer) package
└── schemas/              # Provider registry and API schemas
```
Quick Start
Install in your language of choice:
| Language | Install |
|---|---|
| Python | pip install liter-llm |
| Node.js | pnpm add @kreuzberg/liter-llm |
| Rust | cargo add liter-llm |
| Go | go get github.com/kreuzberg-dev/liter-llm/packages/go |
| Java | dev.kreuzberg:liter-llm (Maven/Gradle) |
| Ruby | gem install liter_llm |
| PHP | composer require kreuzberg/liter-llm |
| C# | dotnet add package LiterLlm |
| Elixir | {:liter_llm, "~> 1.0"} in mix.exs |
| WASM | pnpm add @kreuzberg/liter-llm-wasm |
| C/FFI | Build from source -- see FFI crate |
Usage
```python
import asyncio
import os

from liter_llm import LlmClient

async def main():
    client = LlmClient(api_key=os.environ["OPENAI_API_KEY"])

    # Chat with any provider using the provider/model prefix
    response = await client.chat(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

    # Switch providers by changing the prefix -- no other code changes
    client2 = LlmClient(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = await client2.chat(
        model="anthropic/claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```
Or use a liter-llm.toml config file instead of passing everything in code:
```toml
api_key = "sk-..."
timeout_secs = 120

[cache]
max_entries = 512
ttl_seconds = 600
backend = "redis"
backend_config = { connection_string = "redis://localhost:6379" }

[budget]
global_limit = 50.0
enforcement = "hard"

[[providers]]
name = "my-provider"
base_url = "https://my-llm.example.com/v1"
model_prefixes = ["my-provider/"]
```
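The `[budget]` block above distinguishes hard and soft enforcement. One plausible reading of those modes -- an assumption based only on the config keys (`global_limit`, `enforcement`), not on documented semantics -- is that hard mode rejects requests that would exceed the limit while soft mode lets them through:

```python
class BudgetGuard:
    """Track cumulative spend against a global limit.

    'hard' blocks a request that would exceed the limit; 'soft' allows
    it anyway. These semantics are assumed for this sketch, not taken
    from the liter-llm docs.
    """

    def __init__(self, global_limit: float, enforcement: str = "hard"):
        self.limit = global_limit
        self.enforcement = enforcement
        self.spent = 0.0

    def check(self, estimated_cost: float) -> bool:
        """Return True if the request may proceed; record spend if so."""
        would_exceed = self.spent + estimated_cost > self.limit
        if would_exceed and self.enforcement == "hard":
            return False
        self.spent += estimated_cost
        return True

guard = BudgetGuard(global_limit=50.0, enforcement="hard")
print(guard.check(49.0))  # allowed: within budget
print(guard.check(2.0))   # blocked: would exceed the hard limit
```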
The same API is available in all 11 languages -- see the language READMEs below for idiomatic examples.
Core API
All bindings expose a unified chat() function:
| Language | Usage |
|---|---|
| Rust | DefaultClient::new(config).chat(messages, options).await |
| Python | LlmClient(api_key=...).chat(messages, config) |
| Node.js | new LlmClient({ apiKey }).chat(messages, config) |
| Go | client.Chat(ctx, messages, config) |
| Java | client.chat(messages, configJson) |
| Ruby | LiterLlm::LlmClient.new(api_key, config).chat(messages) |
| Elixir | LiterLlm.chat(messages, config) |
| PHP | LiterLlm\LlmClient::new($apiKey)->chat($messages, $config) |
| C# | new LlmClient(apiKey).ChatAsync(messages, config) |
| WASM | new LlmClient({ apiKey }).chat(messages, config) |
| C FFI | liter_llm_chat(client, messages_json, config_json) |
Language READMEs
| Language | README | Binding |
|---|---|---|
| Python | packages/python | PyO3 |
| TypeScript / Node.js | crates/liter-llm-node | NAPI-RS |
| Go | packages/go | cgo |
| Java | packages/java | Panama FFI |
| Ruby | packages/ruby | Magnus |
| Elixir | packages/elixir | Rustler NIF |
| PHP | packages/php | ext-php-rs |
| .NET (C#) | packages/csharp | P/Invoke |
| WebAssembly | crates/liter-llm-wasm | wasm-bindgen |
| C/C++ (FFI) | crates/liter-llm-ffi | C ABI |
Part of kreuzberg.dev
liter-llm is built by the kreuzberg.dev team -- the same people behind Kreuzberg (document extraction for 91+ formats), tree-sitter-language-pack (multilingual parsing), and html-to-markdown. All our libraries share the same Rust-core, polyglot-bindings architecture. Visit kreuzberg.dev or find us on GitHub.
Contributing
Contributions are welcome! See CONTRIBUTING.md for guidelines.
Join our Discord community for questions and discussion.
License
MIT -- see LICENSE for details.